auditory cortex


Supplementary Material: Interpretable multi-timescale models for predicting fMRI responses to continuous natural speech

Neural Information Processing Systems

Additional subject flatmaps are shown in figures 2-7 at the end of the document. Only significantly predicted voxels are shown. These flatmaps correspond to figures 3-5 in the main text and follow the same colormap. Note that subject S04 is excluded from this study due to poor data quality, resulting in 6 subjects overall. Results from subject S03 (highest number of significant voxels) are shown in the main text. Several voxels in the auditory cortex (AC) show a preference for long CLs; to investigate this, we examine the encoding performance of the different CL models.
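
This comparison of CL models is easy to sketch. The snippet below is a minimal illustration rather than the paper's actual pipeline: it fits a ridge encoding model on features extracted at each of several hypothetical CLs (all data, shapes, and CL values are placeholders) and reports the mean held-out correlation across voxels.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def voxelwise_score(features, bold):
    """Fit a ridge encoding model; return per-voxel Pearson r on held-out TRs."""
    n_train = int(0.8 * len(features))
    model = Ridge(alpha=1.0).fit(features[:n_train], bold[:n_train])
    pred, true = model.predict(features[n_train:]), bold[n_train:]
    pred_c, true_c = pred - pred.mean(0), true - true.mean(0)
    return (pred_c * true_c).sum(0) / (
        np.linalg.norm(pred_c, axis=0) * np.linalg.norm(true_c, axis=0)
    )

# Placeholder data: BOLD responses (500 TRs x 1000 voxels) and stimulus
# features extracted at three hypothetical context lengths.
bold = rng.standard_normal((500, 1000))
features_by_cl = {cl: rng.standard_normal((500, 256)) for cl in (4, 16, 64)}

for cl, feats in features_by_cl.items():
    print(f"CL={cl:3d}: mean r over voxels = {voxelwise_score(feats, bold).mean():.3f}")
```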


fMRI2GES: Co-speech Gesture Reconstruction from fMRI Signal with Dual Brain Decoding Alignment

Zhu, Chunzheng, Shao, Jialin, Lin, Jianxin, Wang, Yijun, Wang, Jing, Tang, Jinhui, Li, Kenli

arXiv.org Artificial Intelligence

Understanding how the brain responds to external stimuli and decoding this process has been a significant challenge in neuroscience. While previous studies typically concentrated on brain-to-image and brain-to-language reconstruction, our work strives to reconstruct gestures associated with speech stimuli perceived by the brain. Unfortunately, the lack of paired {brain, speech, gesture} data hinders the deployment of deep learning models for this purpose. In this paper, we introduce a novel approach, fMRI2GES, that allows training of fMRI-to-gesture reconstruction networks on unpaired data using Dual Brain Decoding Alignment. This method relies on two key components: (i) observed texts that elicit brain responses, and (ii) textual descriptions associated with the gestures. Then, instead of training models in a completely supervised manner to find a mapping relationship among the three modalities, we harness an fMRI-to-text model and a text-to-gesture model trained on paired data, together with an fMRI-to-gesture model trained on unpaired data, establishing dual fMRI-to-gesture reconstruction patterns. Afterward, we explicitly align the two outputs and train our model in a self-supervised way. We show that our proposed method can reconstruct expressive gestures directly from fMRI recordings. We also investigate fMRI signals from different ROIs in the cortex and how they affect generation results. Overall, we provide new insights into decoding co-speech gestures, thereby advancing our understanding of neuroscience and cognitive science.
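
A compact way to see the dual-alignment idea is as a consistency loss between two reconstruction paths. The PyTorch sketch below is a toy rendering, not the authors' architecture: the fMRI-to-text and text-to-gesture modules are assumed pretrained and frozen, the direct fMRI-to-gesture network is trained self-supervised to match the indirect path's output, and all module definitions and dimensions are placeholders.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the three modules named in the abstract; the real
# architectures are not specified here, so simple MLPs serve as placeholders.
fmri_to_text = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 256))
text_to_gesture = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 128))
fmri_to_gesture = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 128))

# The indirect path (fMRI -> text -> gesture) is assumed pretrained and frozen;
# only the direct fMRI-to-gesture model is trained, without paired gestures.
for module in (fmri_to_text, text_to_gesture):
    for p in module.parameters():
        p.requires_grad_(False)

optimizer = torch.optim.Adam(fmri_to_gesture.parameters(), lr=1e-4)

fmri = torch.randn(32, 2048)  # a batch of fMRI feature vectors (placeholder)

# Dual reconstruction: align the direct output with the indirect output.
with torch.no_grad():
    indirect = text_to_gesture(fmri_to_text(fmri))
direct = fmri_to_gesture(fmri)
loss = nn.functional.mse_loss(direct, indirect)

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"alignment loss: {loss.item():.4f}")
```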


Better audio representations are more brain-like: linking model-brain alignment with performance in downstream auditory tasks

Pepino, Leonardo, Riera, Pablo, Kamienkowski, Juan, Ferrer, Luciana

arXiv.org Artificial Intelligence

Artificial neural networks (ANNs) are increasingly powerful models of brain computation, yet it remains unclear whether improving their task performance also makes their internal representations more similar to brain signals. To address this question in the auditory domain, we quantified the alignment between the internal representations of 36 different audio models and brain activity from two independent fMRI datasets. Using voxel-wise and component-wise regression, and representation similarity analysis (RSA), we found that recent self-supervised audio models with strong performance in diverse downstream tasks are better predictors of auditory cortex activity than older and more specialized models. To assess the quality of the audio representations, we evaluated these models in 6 auditory tasks from the HEAREval benchmark, spanning music, speech, and environmental sounds. This revealed strong positive Pearson correlations ($r>0.7$) between a model's overall task performance and its alignment with brain representations. Finally, we analyzed the evolution of the similarity between audio and brain representations during the pretraining of EnCodecMAE. We discovered that brain similarity increases progressively and emerges early during pretraining, despite the model not being explicitly optimized for this objective. This suggests that brain-like representations can be an emergent byproduct of learning to reconstruct missing information from naturalistic audio data.
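
Of the alignment measures mentioned, RSA is the most self-contained to illustrate. The sketch below uses random placeholder data (it does not reproduce the paper's models or fMRI datasets): it builds representational dissimilarity matrices for a model layer and a set of voxels responding to the same stimuli, then rank-correlates them.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Placeholder data: responses of one model layer and of auditory-cortex voxels
# to the same 100 sound stimuli.
model_acts = rng.standard_normal((100, 768))
brain_resps = rng.standard_normal((100, 500))

# Representational dissimilarity matrices (condensed form): 1 - Pearson r
# between the response patterns for every pair of stimuli.
model_rdm = pdist(model_acts, metric="correlation")
brain_rdm = pdist(brain_resps, metric="correlation")

# RSA score: rank correlation between the two RDMs.
rho, _ = spearmanr(model_rdm, brain_rdm)
print(f"model-brain RSA (Spearman rho): {rho:.3f}")
```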


Brains on Beats

Umut Güçlü, Jordy Thielen, Michael Hanke, Marcel A. J. van Gerven

Neural Information Processing Systems

We developed task-optimized deep neural networks (DNNs) that achieved state-of-the-art performance in different evaluation scenarios for automatic music tagging. These DNNs were subsequently used to probe the neural representations of music. Representational similarity analysis revealed the existence of a representational gradient across the superior temporal gyrus (STG).
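
A representational gradient of this kind is usually read off by asking, for each cortical site, which DNN layer's representational dissimilarity matrix (RDM) matches best. Below is a minimal sketch with synthetic RDMs; the number of layers, STG positions, and feature sizes are all hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_stimuli = 60

# Synthetic RDMs (condensed form) for 5 DNN layers and for voxel patches at
# 8 positions along the superior temporal gyrus.
layer_rdms = [pdist(rng.standard_normal((n_stimuli, 128)), "correlation") for _ in range(5)]
stg_rdms = [pdist(rng.standard_normal((n_stimuli, 64)), "correlation") for _ in range(8)]

# For each STG position, find the best-matching layer; a systematic shift of
# the winning layer along the gyrus would indicate a representational gradient.
for pos, brain_rdm in enumerate(stg_rdms):
    scores = [spearmanr(layer_rdm, brain_rdm)[0] for layer_rdm in layer_rdms]
    print(f"STG position {pos}: best-matching layer = {int(np.argmax(scores))}")
```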

End-to-end Topographic Auditory Models Replicate Signatures of Human Auditory Cortex

Al-Tahan, Haider, Deb, Mayukh, Feather, Jenelle, Murty, N. Apurva Ratan

arXiv.org Artificial Intelligence

The human auditory cortex is topographically organized. Neurons with similar response properties are spatially clustered, forming smooth maps for acoustic features such as frequency in early auditory areas, and modular regions selective for music and speech in higher-order cortex. Yet, evaluations of current computational models of auditory perception do not measure whether such topographic structure is present in a candidate model. Here, we show that cortical topography is not present in the previous best-performing models at predicting human auditory fMRI responses. To encourage the emergence of topographic organization, we adapt a cortical wiring-constraint loss originally designed for visual perception. The new class of topographic auditory models, TopoAudio, are trained to classify speech and environmental sounds from cochleagram inputs, with an added constraint that nearby units on a 2D cortical sheet develop similar tuning. Despite these additional constraints, TopoAudio achieves high accuracy on benchmark tasks, comparable to the unconstrained non-topographic baseline models. Further, TopoAudio predicts fMRI responses in the brain as well as standard models do, but unlike standard models, it develops smooth topographic maps for tonotopy and amplitude modulation (common properties of early auditory representations), as well as clustered response modules for music and speech (the higher-order selectivity observed in the human auditory cortex). TopoAudio is the first end-to-end biologically grounded auditory model to exhibit emergent topography, and our results emphasize that a wiring-length constraint can serve as a general-purpose regularization tool to achieve biologically aligned representations.
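
The abstract does not spell out the wiring-constraint loss, so the sketch below shows one plausible variant: each unit gets a fixed coordinate on a 2D cortical sheet, and pairs of nearby units are penalized for having weakly correlated responses. Such a term would be added to the classification loss during training; the kernel width and grid size here are arbitrary.

```python
import torch

def wiring_constraint_loss(activations, coords, sigma=1.0):
    """Encourage units that are close on a 2D cortical sheet to respond
    similarly: penalize response dissimilarity weighted by spatial proximity.

    activations: (batch, units) layer responses
    coords:      (units, 2) fixed positions of the units on the sheet
    """
    acts = activations - activations.mean(0, keepdim=True)
    acts = acts / (acts.norm(dim=0, keepdim=True) + 1e-8)
    tuning_sim = acts.T @ acts                        # pairwise response correlation
    dist = torch.cdist(coords, coords)                # pairwise cortical distance
    proximity = torch.exp(-dist**2 / (2 * sigma**2))  # nearby pairs weigh more
    # High loss when nearby units are dissimilar (weakly correlated).
    return (proximity * (1 - tuning_sim)).mean()

# Example: 256 units on a 16x16 sheet, random batch of activations.
grid = torch.stack(torch.meshgrid(torch.arange(16.), torch.arange(16.), indexing="ij"), -1)
coords = grid.reshape(-1, 2)
acts = torch.randn(32, 256)
print(wiring_constraint_loss(acts, coords))
```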


Brain-tuned Speech Models Better Reflect Speech Processing Stages in the Brain

Moussa, Omer, Toneva, Mariya

arXiv.org Artificial Intelligence

Pretrained self-supervised speech models excel in speech tasks but do not reflect the hierarchy of human speech processing, as they encode rich semantics in middle layers and poor semantics in late layers. Recent work showed that brain-tuning (fine-tuning models using human brain recordings) improves speech models' semantic understanding. Here, we examine how well brain-tuned models further reflect the brain's intermediate stages of speech processing. We find that late layers of brain-tuned models substantially improve over pretrained models in their alignment with semantic language regions. Further layer-wise probing reveals that early layers remain dedicated to low-level acoustic features, while late layers become the best at complex high-level tasks. These findings show that brain-tuned models not only perform better but also exhibit a well-defined hierarchical processing going from acoustic to semantic representations, making them better model organisms for human speech processing.
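
Layer-wise probing of this sort reduces to fitting one linear classifier per layer and comparing accuracies. Here is a minimal sketch on synthetic features; the layer count, feature size, and task labels are placeholders, not the paper's probes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder: per-layer features for 300 utterances from a 12-layer speech
# model, plus labels for some high-level task (e.g., a semantic category).
labels = rng.integers(0, 4, size=300)
layer_feats = [rng.standard_normal((300, 768)) for _ in range(12)]

# One linear probe per layer: higher late-layer accuracy on high-level tasks
# is the hierarchy signature described in the abstract.
for i, feats in enumerate(layer_feats):
    acc = cross_val_score(LogisticRegression(max_iter=1000), feats, labels, cv=3).mean()
    print(f"layer {i:2d}: probe accuracy = {acc:.3f}")
```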


BrainWavLM: Fine-tuning Speech Representations with Brain Responses to Language

Vattikonda, Nishitha, Vaidya, Aditya R., Antonello, Richard J., Huth, Alexander G.

arXiv.org Artificial Intelligence

Speech encoding models use auditory representations to predict how the human brain responds to spoken language stimuli. The most performant encoding models linearly map the hidden states of artificial neural networks to brain data, but this linear restriction may limit their effectiveness. In this work, we use low-rank adaptation (LoRA) to fine-tune a WavLM-based encoding model end-to-end on a brain encoding objective, producing a model we name BrainWavLM. We show that fine-tuning across all of cortex improves average encoding performance with greater stability than without LoRA. This improvement comes at the expense of low-level regions like auditory cortex (AC), but selectively fine-tuning on these areas improves performance in AC while largely retaining the gains made in the rest of cortex. Fine-tuned models generalized across subjects, indicating that they learned robust brain-like representations of the speech stimuli. Finally, by training linear probes, we showed that the brain data strengthened semantic representations in the speech model without any explicit annotations. Our results demonstrate that brain fine-tuning produces best-in-class speech encoding models, and that non-linear methods have the potential to bridge the gap between artificial and biological representations of semantics.
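
As a rough illustration of the recipe, the sketch below hand-rolls LoRA on a single linear layer standing in for one WavLM projection and trains it jointly with a linear voxel head on an MSE brain-encoding objective. All dimensions and data are placeholders; the real BrainWavLM setup fine-tunes many layers of an actual WavLM model.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (LoRA)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Stand-in for one projection inside a speech model (e.g., an attention
# projection in WavLM); only the LoRA factors and the head are trained.
speech_layer = LoRALinear(nn.Linear(768, 768))
encoding_head = nn.Linear(768, 1000)  # linear map to 1000 voxels

params = [p for p in speech_layer.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(params + list(encoding_head.parameters()), lr=1e-4)

feats = torch.randn(16, 768)   # placeholder pooled speech features per TR
bold = torch.randn(16, 1000)   # placeholder BOLD responses

pred = encoding_head(speech_layer(feats))
loss = nn.functional.mse_loss(pred, bold)  # brain encoding objective
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"encoding loss: {loss.item():.4f}")
```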